Skip to content

Conversation

andyyang890
Copy link
Collaborator

This patch fixes a bug where the memory monitor wouldn't reclaim the
memory allocated to events corresponding to unwatched column families
for a changefeed that targets only a subset of a table's families.

Fixes #154776

Release note (bug fix): A bug where a changefeed targeting only a subset
of a table's column families could become stuck has been fixed.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@andyyang890 andyyang890 force-pushed the 20251003-fix-alloc-leak branch from fb86bc0 to 3804925 Compare October 4, 2025 02:01
@andyyang890 andyyang890 changed the title changefeedccl: fix unwatched column families memory bug changefeedccl: fix unwatched column families memory monitoring bug Oct 4, 2025
@andyyang890 andyyang890 added backport-24.1.x Flags PRs that need to be backported to 24.1. backport-24.3.x Flags PRs that need to be backported to 24.3 backport-25.2.x Flags PRs that need to be backported to 25.2 backport-25.3.x Flags PRs that need to be backported to 25.3 backport-25.4.x Flags PRs that need to be backported to 25.4 labels Oct 4, 2025
@andyyang890 andyyang890 force-pushed the 20251003-fix-alloc-leak branch 3 times, most recently from 445e667 to fc8482b Compare October 4, 2025 12:54
@andyyang890 andyyang890 requested a review from aerfrei October 4, 2025 13:01
@andyyang890 andyyang890 marked this pull request as ready for review October 4, 2025 13:01
@andyyang890 andyyang890 requested a review from a team as a code owner October 4, 2025 13:01
}

// Update watched column again to verify the feed is still progressing.
// If the memory allocations leaked, this assertion would time out.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to get this to fail with a clearer symptom than a timeout (at least for enterprise feeds)? This is my first time looking at the memory monitor, but my sense was that since this would set a low PerChangefeedMemLimit, that the memory monitors should fail the feed once the limit is passed. But when I tried to check that on the status of the feed in this test, I saw that it was still running.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the allocations leaked, the memory monitor wouldn't fail; instead, the changefeed would get stuck, which is why the assertion would time out. I added some more explanatory comments to the test.

@andyyang890 andyyang890 force-pushed the 20251003-fix-alloc-leak branch from fc8482b to a14e13a Compare October 6, 2025 16:52
@andyyang890 andyyang890 requested a review from asg0451 October 6, 2025 16:53
This patch fixes a bug where the memory monitor wouldn't reclaim the
memory allocated to events corresponding to unwatched column families
for a changefeed that targets only a subset of a table's families.

Release note (bug fix): A bug where a changefeed targeting only a subset
of a table's column families could become stuck has been fixed.
@andyyang890 andyyang890 force-pushed the 20251003-fix-alloc-leak branch from a14e13a to 4f701c2 Compare October 6, 2025 19:53
@andyyang890
Copy link
Collaborator Author

TFTRs!

bors r=aerfrei,asg0451

@craig craig bot merged commit 1e387fd into cockroachdb:master Oct 7, 2025
24 checks passed
@craig
Copy link
Contributor

craig bot commented Oct 7, 2025

Copy link

blathers-crl bot commented Oct 7, 2025

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


💡 Consider backporting to the fork repo instead of the main repo. See instructions for more details.

error creating merge commit from 4f701c2 to blathers/backport-release-24.1-154802: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 24.1.x failed. See errors above.


💡 Consider backporting to the fork repo instead of the main repo. See instructions for more details.

error creating merge commit from 4f701c2 to blathers/backport-release-24.3-154802: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 24.3.x failed. See errors above.


💡 Consider backporting to the fork repo instead of the main repo. See instructions for more details.

error creating merge commit from 4f701c2 to blathers/backport-release-25.2-154802: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 25.2.x failed. See errors above.


💡 Consider backporting to the fork repo instead of the main repo. See instructions for more details.

error creating merge commit from 4f701c2 to blathers/backport-release-25.3-154802: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 25.3.x failed. See errors above.


💡 Consider backporting to the fork repo instead of the main repo. See instructions for more details.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-24.1.x Flags PRs that need to be backported to 24.1. backport-24.3.x Flags PRs that need to be backported to 24.3 backport-25.2.x Flags PRs that need to be backported to 25.2 backport-25.3.x Flags PRs that need to be backported to 25.3 backport-25.4.x Flags PRs that need to be backported to 25.4 backport-failed target-release-26.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

changefeedccl: feed excluding column families leaks mem mon allocs for ignored events
4 participants